Statistical Inference for Hardy-Weinberg Proportions in the Presence of Missing Genotype Information

Authors

  • Jan Graffelman
  • Milagros Sánchez
  • Samantha Cook
  • Victor Moreno
Abstract

In genetic association studies, tests for Hardy-Weinberg proportions are often employed as a quality control checking procedure. Missing genotypes are typically discarded prior to testing. In this paper we show that inference for Hardy-Weinberg proportions can be biased when missing values are discarded. We propose to use multiple imputation of missing values in order to improve inference for Hardy-Weinberg proportions. For imputation we employ a multinomial logit model that uses information from allele intensities and/or neighbouring markers. Analysis of an empirical data set of single nucleotide polymorphisms possibly related to colon cancer reveals that missing genotypes are not missing completely at random. Deviation from Hardy-Weinberg proportions is mostly due to a lack of heterozygotes. Inbreeding coefficients estimated by multiple imputation of the missings are typically lowered with respect to inbreeding coefficients estimated by discarding the missings. Accounting for missings by multiple imputation qualitatively changed the results of 10 to 17% of the statistical tests performed. Estimates of inbreeding coefficients obtained by multiple imputation showed high correlation with estimates obtained by single imputation using an external reference panel. Our conclusion is that imputation of missing data leads to improved statistical inference for Hardy-Weinberg proportions.
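As a minimal illustration of the quantities the abstract refers to, the sketch below computes the sample inbreeding coefficient and the chi-square test for Hardy-Weinberg proportions (without continuity correction) from the genotype counts of a biallelic marker. It is the conventional complete-case calculation, that is, the counts left after missing genotypes have been discarded, which is the practice the abstract warns can be biased. It is not the authors' multiple-imputation procedure; the function name `hwe_summary` and the example counts are hypothetical.

```python
import math


def hwe_summary(n_aa, n_ab, n_bb):
    """Return (f_hat, chi2, p_value) for a biallelic marker.

    n_aa, n_ab, n_bb are observed genotype counts with missing
    genotypes already discarded (complete-case analysis).
    """
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab  # count of allele A
    n_b = 2 * n_bb + n_ab  # count of allele B
    if n == 0 or n_a == 0 or n_b == 0:
        raise ValueError("marker must be polymorphic with at least one genotype")

    # Sample inbreeding coefficient: 1 - (observed / expected heterozygotes).
    # Positive values indicate a lack of heterozygotes.
    f_hat = (4 * n_aa * n_bb - n_ab ** 2) / (n_a * n_b)

    # For a biallelic marker the uncorrected chi-square statistic equals n * f_hat^2.
    chi2 = n * f_hat ** 2

    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return f_hat, chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts showing a heterozygote deficit.
    print(hwe_summary(30, 40, 30))
```

A positive `f_hat` corresponds to the lack of heterozygotes that the abstract reports as the main source of deviation. Under an imputation approach, the same summary would be computed on each completed data set and the results pooled.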

Related resources

Exact Inference for Hardy-Weinberg Proportions with Missing Genotypes: Single and Multiple Imputation

This paper addresses the issue of exact-test based statistical inference for Hardy-Weinberg equilibrium in the presence of missing genotype data. Missing genotypes often are discarded when markers are tested for Hardy-Weinberg equilibrium, which can lead to bias in the statistical inference about equilibrium. Single and multiple imputation can improve inference on equilibrium. We develop tests ...

Hardy Weinberg Equilibrium Testing and Interpretation: Focus on infection

Hardy-Weinberg equilibrium (HWE) holds when, in a closed population with random mating and without mutation and natural selection, genotype frequencies at any locus are a simple function of allele frequencies. Testing for HWE is now a common practice in population genetics and genetic association studies of non-communicable diseases; however, it is less regarded, or sometimes misinterpreted, i...

On the testing of Hardy‐Weinberg proportions and equality of allele frequencies in males and females at biallelic genetic markers

Standard statistical tests for equality of allele frequencies in males and females and tests for Hardy-Weinberg equilibrium are tightly linked by their assumptions. Tests for equality of allele frequencies assume Hardy-Weinberg equilibrium, whereas the usual chi-square or exact test for Hardy-Weinberg equilibrium assumes equality of allele frequencies in the sexes. In this paper, we propose ways...

A Powerful Method for Including Genotype Uncertainty in Tests of Hardy-Weinberg Equilibrium

The use of posterior probabilities to summarize genotype uncertainty is pervasive across genotype, sequencing and imputation platforms. Prior work in many contexts has shown the utility of incorporating genotype uncertainty (posterior probabilities) in downstream statistical tests. Typical approaches to incorporating genotype uncertainty when testing Hardy-Weinberg equilibrium tend to lack cali...

How To Perform Meaningful Estimates of Genetic Effects

Although the genotype-phenotype map plays a central role in both quantitative and evolutionary genetics, the formalization of a completely general and satisfactory model of genetic effects, particularly one accounting for epistasis, remains a theoretical challenge. Here, we use a two-locus genetic system in simulated populations with epistasis to show the convenience of using a recently developed m...

Journal title:

Volume 8, Issue

Pages -

Publication date: 2013